Frequency warping and robust speaker verification: a comparison of alternative mel-scale representations

Authors

Tomi Kinnunen

Md. Jahangir Alam

Pavel Matejka

Patrick Kenny

Jan Cernocký

Douglas D. O'Shaughnessy

Abstract

Accuracy of speaker verification is high under controlled conditions but falls off rapidly in the presence of interfering sounds. This is because spectral features, such as Mel-frequency cepstral coefficients (MFCCs), are sensitive to additive noise. MFCCs are a particular realization of a warped-frequency representation with low-frequency focus, but there are several alternative, potentially more robust, warped-frequency representations. We provide an experimental comparison of five warped-frequency features. They use exactly the same frequency warping function, the same number of coefficients and the same postprocessing, but differ in their internal computations. The compared variants are (1) conventional MFCCs from the discrete Fourier transform (DFT), followed by a Mel-scaled filterbank, (2) MFCCs via direct warping of the DFT, followed by a linear-scale filterbank, (3) warped linear prediction features, (4) perceptual minimum variance distortionless features and (5) recently proposed sparse Mel-scale histogram features. Experiments carried out on a subset of the SRE 10 corpus using a scaled-down i-vector system indicate that direct DFT warping outperforms conventional MFCCs in most cases.
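
The difference between variants (1) and (2) can be illustrated with a minimal sketch. It is not the authors' implementation: the Mel formula (the common 2595·log10(1 + f/700) form), the triangular filter shape, the filter and coefficient counts, the linear interpolation used for warping, and all function names are assumptions made for illustration only. Both functions share one warping function and one log-plus-DCT step, and differ only in where the warping is applied.

```python
import cmath
import math

def hz_to_mel(f):
    # Assumed Mel formula; the abstract does not state which one is used.
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]

def triangular_bank(edges, axis):
    """One triangular filter per consecutive (lo, mid, hi) triple of edges."""
    bank = []
    for lo, mid, hi in zip(edges, edges[1:], edges[2:]):
        row = []
        for a in axis:
            if lo < a <= mid:
                row.append((a - lo) / (mid - lo))
            elif mid < a < hi:
                row.append((hi - a) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def interp(x, xs, ys):
    """Linear interpolation on a uniform grid xs, clamped at both ends."""
    step = xs[1] - xs[0]
    pos = min(max((x - xs[0]) / step, 0.0), len(xs) - 1.0)
    i = min(int(pos), len(xs) - 2)
    frac = pos - i
    return ys[i] * (1.0 - frac) + ys[i + 1] * frac

def log_dct(energies, n_ceps):
    """Log filterbank energies followed by an unnormalized DCT-II."""
    logs = [math.log(e + 1e-10) for e in energies]
    m = len(logs)
    return [sum(v * math.cos(math.pi * q * (j + 0.5) / m)
                for j, v in enumerate(logs)) for q in range(n_ceps)]

def mfcc_conventional(frame, fs, n_filters=20, n_ceps=12):
    """Variant (1): linear-frequency DFT, filters spaced uniformly in Mel."""
    spec = power_spectrum(frame)
    freqs = [k * fs / len(frame) for k in range(len(spec))]
    mel_max = hz_to_mel(fs / 2.0)
    edges_hz = [mel_to_hz(i * mel_max / (n_filters + 1))
                for i in range(n_filters + 2)]
    bank = triangular_bank(edges_hz, freqs)
    return log_dct([sum(w * p for w, p in zip(row, spec)) for row in bank],
                   n_ceps)

def mfcc_warped_dft(frame, fs, n_filters=20, n_ceps=12):
    """Variant (2): DFT resampled onto a uniform Mel grid, uniform filters."""
    spec = power_spectrum(frame)
    freqs = [k * fs / len(frame) for k in range(len(spec))]
    mel_max = hz_to_mel(fs / 2.0)
    mel_grid = [i * mel_max / (len(spec) - 1) for i in range(len(spec))]
    warped = [interp(mel_to_hz(m), freqs, spec) for m in mel_grid]
    edges_mel = [i * mel_max / (n_filters + 1) for i in range(n_filters + 2)]
    bank = triangular_bank(edges_mel, mel_grid)
    return log_dct([sum(w * p for w, p in zip(row, warped)) for row in bank],
                   n_ceps)
```

For a 256-sample frame at an assumed 8 kHz sampling rate, both functions return 12 coefficients. In the first, the filters narrow toward low frequencies on a fixed axis; in the second, the axis itself is stretched and the filters stay equally wide.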

Similar resources

Forensic Speaker Verification in Noisy Environmental by Enhancing the Speech Signal Using ICA Approach

We propose a system to real environmental noise and channel mismatch for forensic speaker verification systems. This method is based on suppressing various types of real environmental noise by using independent component analysis (ICA) algorithm. The enhanced speech signal is applied to mel frequency cepstral coefficients (MFCC) or MFCC feature warping to extract the essential characteristics o...

Wavelet-Based Mel-Frequency Cepstral Coefficients for Speaker Identification using Hidden Markov Models

To improve the performance of speaker identification systems, an effective and robust method is proposed to extract speech features, capable of operating in noisy environment. Based on the time-frequency multi-resolution property of wavelet transform, the input speech signal is decomposed into various frequency channels. For capturing the characteristic of the signal, the Mel-Frequency Cepstral...

Using Exciting and Spectral Envelope Information and Matrix Quantization for Improvement of the Speaker Verification Systems

Speaker verification from a few words of spoken sentences has many applications. Many methods, such as DTW, HMM, VQ and MQ, can be used for speaker verification. We applied MQ for its precise, reliable and robust performance with computational simplicity. We also used pitch frequency and log gain contour to further improve system performance.

Exploiting frequency-scaling invariance properties of the scale transform for automatic speech recognition

An experimental study of the application of scale-transform to improve the performance of speaker independent continuous speech recognition, is presented in this paper. Three major results are described. First, a comparison was made between the scale-transform based magnitude cepstrum coefficients (STCC) and mel-scale filter bank cepstrum coefficients (MFCC) on a telephone based connected digit recog...

Publication date: 2013
